Path Coding Penalties for Directed Acyclic Graphs
Authors
Abstract
We consider supervised learning problems where the features are embedded in a graph, such as gene expressions in a gene network. In this context, it is of great interest to automatically select a subgraph with a small number of connected components, either to improve prediction performance or to obtain more interpretable results. Existing regularization or penalty functions for this purpose typically require solving a combinatorially hard selection problem over all connected subgraphs. In this paper, we address this issue for directed acyclic graphs (DAGs) and propose structured sparsity penalties over paths on a DAG (called “path coding” penalties). We design minimum cost flow formulations to compute the penalties and their proximal operators in polynomial time, which allows us in practice to efficiently select a subgraph with a small number of connected components. We present experiments on image and genomic data to illustrate the sparsity and connectivity benefits of path coding penalties over some existing ones, as well as the scalability of our approach for prediction tasks.
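To make the idea of a penalty over paths concrete, the following is a minimal sketch and not the method of the paper: it enumerates paths by brute force instead of using a minimum cost flow formulation, so it only works on tiny graphs. It assumes one plausible reading of the penalty, namely the cheapest set of paths whose union covers the selected features, with a hypothetical per-path cost `gamma + len(path)`. The names `all_paths`, `path_cover_penalty`, and `gamma` are illustrative and do not come from the abstract.

```python
from itertools import combinations


def all_paths(dag):
    """Enumerate every directed path (as a tuple of nodes) in a small DAG.

    `dag` maps each node to a list of its successors; every node,
    including sinks, must appear as a key.
    """
    paths = []

    def extend(path):
        paths.append(tuple(path))
        for nxt in dag.get(path[-1], ()):
            extend(path + [nxt])

    for node in dag:
        extend([node])
    return paths


def path_cover_penalty(dag, support, gamma=1.0):
    """Brute-force minimum total cost of a set of paths covering `support`.

    Assumption (not stated in the abstract): a path g costs gamma + len(g).
    Exponential in the graph size, so this is for illustration only.
    """
    support = set(support)
    if not support:
        return 0.0
    paths = all_paths(dag)
    best = float("inf")
    # With positive path costs, an optimal cover never needs more paths
    # than there are nodes to cover.
    for k in range(1, len(support) + 1):
        for group in combinations(paths, k):
            covered = set().union(*group)
            if support <= covered:
                best = min(best, sum(gamma + len(g) for g in group))
    return best


# A chain 1 -> 2 -> 3 -> 4.
chain = {1: [2], 2: [3], 3: [4], 4: []}
adjacent = path_cover_penalty(chain, {1, 2})   # one path (1, 2)
scattered = path_cover_penalty(chain, {1, 4})  # two single-node paths
```

Under these assumptions, two adjacent features are cheaper to select than two features at opposite ends of the chain, which is the connectivity-favoring behavior the abstract describes. The brute-force search is what the polynomial-time flow formulation is meant to avoid.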
Similar resources
Supervised feature selection in graphs with path coding penalties and network flows
We consider supervised learning problems where the features are embedded in a graph, such as gene expressions in a gene network. In this context, it is of much interest to automatically select a subgraph with few connected components; by exploiting prior knowledge, one can indeed improve the prediction performance or obtain results that are easier to interpret. Regularization or penalty functio...
Matrix Representations and Independencies in Directed Acyclic Graphs
For a directed acyclic graph, there are two known criteria to decide whether any specific conditional independence statement is implied for all distributions factorized according to the given graph. Both criteria are based on special types of path in graphs. They are called separation criteria because independence holds whenever the conditioning set is a separating set in a graph theoretical se...
Matrix representations and independencies in directed acyclic graphs
For a directed acyclic graph, there are two known criteria to decide whether any specific conditional independence statement is implied for all distributions factorizing according to the given graph. Both criteria are based on special types of path in graphs. They are called separation criteria because independence holds whenever the conditioning set is a separating set in a graph theoretical s...
Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs.
Directed acyclic graphs are commonly used to represent causal relationships among random variables in graphical models. Applications of these models arise in the study of physical and biological systems where directed edges between nodes represent the influence of components of the system on each other. Estimation of directed graphs from observational data is computationally NP-hard. In additio...
Improved algorithms for replacement paths problems in restricted graphs
We present near-optimal algorithms for two problems related to finding the replacement paths for edges with respect to shortest paths in sparse graphs. The problems essentially study how the shortest paths change as edges on the path fail, one at a time. Our technique improves the existing bounds for these problems on directed acyclic graphs, planar graphs, and non-planar integer-edge-weighted ...
Publication date: 2011